Abstract

This paper presents an approach to the Opinion Finding task of the
TREC 2008 Blog Track. For the Ad-hoc Retrieval subtask, we adopt a
language model to retrieve relevant documents. For the Opinion
Retrieval subtask, we propose a hybrid model that combines a
lexicon-based approach and a machine learning approach to identify and
rank opinionated documents. For the Polarized Opinion Retrieval
subtask, we employ machine learning to predict polarity and a linear
combination technique to rank polar documents. The hybrid model, which
uses both the lexicon-based approach and the machine learning approach
to predict and rank opinionated documents, is the focus of our
participation this year. With the hybrid method for the Opinion
Retrieval subtask, our submitted runs yield a 15% improvement over the
baseline.

1  Introduction 

The TREC 2008 Blog Track defines two main tasks: Opinion Finding and
Blog Distillation. We participated in the Opinion Finding task, which
is split into three separate subtasks (our system structure for the
retrieval subtasks is shown in Figure 1).

The first subtask, Ad-hoc Retrieval, involves finding blog posts that
contain relevant information about a given topic. Because it serves as
the baseline, this subtask is supposed to turn off all opinion finding
features. In order to concentrate on the opinion finding methods, we
simply adopted out-of-the-box models supported by the Lemur toolkit[1]
for this subtask. According to our empirical runs, the language model
performed better than the vector space model for Ad-hoc Retrieval. We
also applied several query expansion techniques, but neither the
proposed cluster-based expansion nor classic pseudo-relevance feedback
improved retrieval performance.

[1] http://www.lemurproject.org

The second subtask, Opinion Retrieval, involves locating blog posts
that express an opinion about a given topic. A relevant blog post must
have an opinion presented in the post or in one of its comments. To
deal with this subtask, we structured our approach as a two-step
process: detecting opinionated documents and ranking them. To detect
the opinionated blog posts, we employed both a lexicon-based approach
(LE) and a machine learning approach (ML). The opinion scores
estimated by LE and ML are then linearly combined with the topic
relevance score to produce the final ranking of the detected
documents. For this two-step process, we proposed a hybrid method that
uses LE and ML at both the detection step and the ranking step. In
addition, we applied spam filtering as a post-processing step of the
Opinion Retrieval subtask, which yielded a marginal improvement in
retrieval performance.

The Polarized Opinion Retrieval subtask involves not only locating
positive (resp. negative) opinionated blog posts but also ranking them
according to their degree of polarity. Ranking is a new requirement
for this subtask in 2008. To meet the requirement of excluding mixed
documents (i.e., ones that contain both positive and negative
opinions), we trained a ternary classifier to classify opinionated
documents into positive, negative, and mixed classes. After removing
mixed documents, we retained positive and negative documents for
ranking. The ranking score is the linear combination of the polar
score predicted by the ternary classifier and the opinion score
estimated in the Opinion Retrieval subtask.

Figure 1: System structure of Opinion Finding task


2  Ad-hoc  Retrieval 

2.1  Indexing 

The Blog06 collection is used again in TREC 2008. For all three
subtasks, we indexed only permalinks as the retrieval unit. In the
pre-processing step, we discarded all HTML tags and redundant scripts
from the permalink documents. Porter stemming and standard stop-word
removal were initially applied for indexing, but they slightly harmed
the preliminary retrieval results. Therefore, we used neither stemming
nor stop-word removal in the final version of the index.


2.2  Query  Processing 

Query expansion is the process of adding terms to the original search
query. To study the effectiveness of query expansion for Ad-hoc
Retrieval, we empirically examined two simple techniques:

Cluster-based analysis: The first technique for query expansion comes
from the hypothesis that clustering can provide extra information for
retrieval. We implemented this idea using the Clusty search engine[2].
Clusty is a text clustering search engine that groups search results
into topic folders. We issue the original query to Clusty and take the
titles of the top n clusters[3] as expansion terms.

Pseudo-relevance feedback: The intuition behind pseudo-relevance
feedback is that the top-ranked retrieved documents are likely to be
the most relevant ones to the given query. Consequently, those
documents might contain terms that are highly related to the topic. As
a simple test of this method, we selected the n most frequent terms in
the top-ranked documents and added them to the original query.

Using the topic description has been reported to be marginally
beneficial to the opinion finding task (Ounis et al., 2006). For this
analysis, we used the terms in the topic description, after removing
standard stop words, as expansion terms for the original query. Given
the query Windows Vista, Table 1 shows an example of processed
queries.

[2] http://clusty.com
[3] In Clusty, clusters are sorted by their number of documents.

Technique                    Processed query
Cluster-based analysis       Windows Vista Microsoft Software Operating system
                             (the titles of the top 3 clusters are added to
                             the original query)
Pseudo-relevance feedback    Windows Vista Microsoft Software beta
                             (the top 3 most frequent terms of the top 50
                             ranked retrieved documents are added to the
                             original query)
Using topic's description    Windows Vista Find opinion Microsoft operating
                             system any features

Table 1: Example of processed queries for the original query Windows Vista.
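The pseudo-relevance feedback expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name expand_query, whitespace tokenization, lowercasing, and the choice to skip terms already present in the query are ours and are not specified in the paper.

```python
from collections import Counter


def expand_query(query, ranked_docs, n_terms=3, top_k=50):
    """Append the n_terms most frequent new terms found in the top_k documents.

    ranked_docs is a list of document texts ordered by retrieval score.
    Tokenization by whitespace and lowercasing are simplifying assumptions.
    """
    query_terms = query.lower().split()
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        for token in doc.lower().split():
            # Assumption: terms already in the query are not counted again.
            if token not in query_terms:
                counts[token] += 1
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return query_terms + expansion
```

With n_terms=3 and top_k=50 this mirrors the setting reported in Table 1. A realistic run would presumably also need a stop-word filter so that function words do not dominate the frequency counts.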

2.3  Language  Model 

As mentioned earlier, we tried several out-of-the-box models for
Ad-hoc Retrieval. In our experiments, the Kullback-Leibler (KL)
divergence model performed best among them. KL is a statistical
language model that scores and ranks documents by the KL-divergence
(i.e., relative entropy) between the query language model and the
document language model (Lafferty and Zhai, 2001). Since only the
ranking of documents is of interest, the KL-divergence is rewritten as
the cross entropy of the query model with respect to the document
model. Because of its performance, we used the KL model for both
submitted baseline runs. Additionally, we applied Bayesian smoothing
using Dirichlet priors, with the prior parameter left at its default
value of 1000.
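As an illustration of the ranking function just described, the following sketch scores a document by the cross entropy between a maximum-likelihood query model and a Dirichlet-smoothed document model. It is not Lemur's implementation; the function names, the list-of-tokens input, and the handling of terms unseen in the collection are assumptions made for the example.

```python
import math
from collections import Counter


def dirichlet_prob(term, doc_counts, doc_len, coll_counts, coll_len, mu=1000.0):
    """Dirichlet-smoothed P(term | document) with prior parameter mu."""
    p_coll = coll_counts[term] / coll_len
    return (doc_counts[term] + mu * p_coll) / (doc_len + mu)


def kl_score(query_tokens, doc_tokens, coll_counts, coll_len, mu=1000.0):
    """Rank-equivalent KL score: sum over query terms of P(w|q) * log P(w|d).

    coll_counts must be a Counter of term frequencies in the whole collection.
    Higher (less negative) scores indicate a closer match.
    """
    query_counts = Counter(query_tokens)
    doc_counts = Counter(doc_tokens)
    score = 0.0
    for term, qtf in query_counts.items():
        if coll_counts[term] == 0:
            continue  # assumption: ignore query terms unseen in the collection
        p_q = qtf / len(query_tokens)
        p_d = dirichlet_prob(term, doc_counts, len(doc_tokens),
                             coll_counts, coll_len, mu)
        score += p_q * math.log(p_d)
    return score
```

Dropping the query-entropy term from the KL-divergence does not change the ordering of documents, which is why only the cross-entropy sum is computed here.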

3  Opinion  Retrieval 

So far, two effective approaches to detecting opinionated documents
are the lexicon-based approach and the machine learning approach. In
this work, we employed both approaches to identify opinionated blog
posts and rank them. The intuition is that each approach is likely to
retrieve a different set of documents, so the combined system should
perform better because of the increase in recall.


3.1  Lexicon-based  Approach 

In general, a lexicon-based approach starts by constructing a
dictionary of terms that indicate opinion. A document is then judged
to be opinionated according to whether opinion terms occur in it.
Following this general framework, we first compiled a list of opinion
words from several sources:

General Inquirer: General Inquirer (Stone et al., 1966) is a manually
constructed lexicon that consists of many semantic and emotional
categories. We selected the words within opinion-related categories[4]
and added them to the final opinion-word list.

WordNet: Starting with a small set of annotated words (seeds), we
iteratively looked up synonyms and antonyms of the seeds in WordNet
(Miller, 1992). At each iteration, the newly found words were added to
the seed list for the next iteration. The process stopped when no new
word could be found. The final expanded word set was appended to the
opinion-word list.
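The iterative expansion above is a fixed-point computation over the synonym and antonym relations. A minimal sketch follows, assuming the relations are available as a plain dictionary; the name expand_seeds and the related mapping are hypothetical stand-ins for actual WordNet lookups.

```python
def expand_seeds(seeds, related):
    """Grow a seed set until no new word is found.

    related maps a word to an iterable of its synonyms and antonyms
    (a toy stand-in for WordNet lookups, assumed for this example).
    """
    expanded = set(seeds)
    frontier = set(seeds)
    while frontier:
        found = set()
        for word in frontier:
            found.update(related.get(word, ()))
        # Only words not seen before are explored in the next iteration.
        frontier = found - expanded
        expanded |= frontier
    return expanded
```

Because every iteration only explores words that were not in the set before, the loop terminates as soon as an iteration finds nothing new, which matches the stopping condition stated in the text.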

Wilson Word Set: This word set was constructed by Wilson et al. (2003)
and includes subjective clues for identifying opinionated sentences.
We appended this word set to the final opinion-word list.

[4] We considered the categories Positive, Negative, Arousal, Emotion,
Feel, Pain, Pleasure, Virtue, Self, Our as opinion-related ones.

After removing duplicate words from the final opinion-word list, we
obtained a dictionary of 29,876 words. The weights of the words were
trained on the assessments for the 100 topics of 2006 and 2007. The
assessed blog posts were split into an opinionated (O) set and a
non-opinionated (N) set. Each word in the dictionary was then assigned
a weight given by:

\[
\mathrm{weight}(w_i) = \frac{P(w_i \mid O) - P(w_i \mid N)}{P(w_i \mid O) + P(w_i \mid N)} \tag{1}
\]

Equation 1 is taken from (Dave et al., 2003), where $P(w_i \mid O)$
(resp., $P(w_i \mid N)$) is estimated as the number of occurrences of
$w_i$ in O (resp., N) divided by the number of occurrences of all
tokens in O (resp., N).
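Equation 1 can be computed directly from token counts. The sketch below assumes that the opinionated and non-opinionated sets are given as flat, non-empty token lists; the function name word_weights and the decision to skip dictionary words that occur in neither set are our assumptions.

```python
from collections import Counter


def word_weights(opinionated_tokens, non_opinionated_tokens, dictionary):
    """Weight each dictionary word by (P(w|O) - P(w|N)) / (P(w|O) + P(w|N))."""
    o_counts = Counter(opinionated_tokens)
    n_counts = Counter(non_opinionated_tokens)
    o_total = len(opinionated_tokens)
    n_total = len(non_opinionated_tokens)
    weights = {}
    for word in dictionary:
        p_o = o_counts[word] / o_total
        p_n = n_counts[word] / n_total
        # Assumption: words absent from both sets get no weight at all.
        if p_o + p_n > 0:
            weights[word] = (p_o - p_n) / (p_o + p_n)
    return weights
```

The weight lies in [-1, 1]: it is positive for words that are relatively more frequent in opinionated posts and negative for words that are more frequent in non-opinionated ones.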

Using the weighted dictionary, a document retrieved by ad-hoc
retrieval is scored as follows:

\[
\mathrm{score}_{LE}(d) = \sum_{w_i \in d} \frac{\mathrm{tfidf}(w_i) \cdot \mathrm{weight}(w_i)}{\mathrm{nearest}(w_i, \mathit{topic})} \tag{2}
\]

where $\mathrm{nearest}(w_i, \mathit{topic})$ is the distance (in
words) between opinion word $w_i$ and the original topic. Document $d$
is determined to be opinionated if $\mathrm{score}_{LE}(d)$ is
positive and non-opinionated otherwise.
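A small sketch of Equation 2 may help make the role of the distance term concrete. Several details are assumptions, since the paper does not specify them: the sum runs over distinct opinion words, nearest is taken as the smallest token distance between any occurrence of the word and any occurrence of a topic term, the distance is clamped to at least 1, and the document length is used when no topic term occurs. The tf-idf values are passed in as a precomputed dictionary.

```python
def score_le(tokens, topic_terms, weights, tfidf):
    """Lexicon-based opinion score of one tokenized document (Equation 2)."""
    topic_pos = [i for i, t in enumerate(tokens) if t in topic_terms]
    score = 0.0
    for word in set(tokens):
        if word not in weights:
            continue  # not an opinion word
        word_pos = [i for i, t in enumerate(tokens) if t == word]
        if topic_pos:
            dist = min(abs(i - j) for i in word_pos for j in topic_pos)
        else:
            dist = len(tokens)  # assumption: fallback when the topic is absent
        dist = max(dist, 1)  # avoid division by zero
        score += tfidf.get(word, 0.0) * weights[word] / dist
    return score


def is_opinionated(tokens, topic_terms, weights, tfidf):
    """A document is opinionated when its lexicon score is positive."""
    return score_le(tokens, topic_terms, weights, tfidf) > 0
```

Dividing by the distance means that opinion words close to the topic mention contribute more to the score than opinion words far away from it.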

3.2  Machine  Learning  Approach 

In the context of machine learning, detecting opinionated documents
can be treated as a binary classification task. We therefore trained a
binary classifier on the assessments of the previous two years. The
SVMlight package[5] was employed for the classification task.
Believing further that machine learning can benefit from
opinion-indicating terms, we used the opinion words in the dictionary
compiled for LE, instead of all lexical unigrams, as training
features.
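To illustrate how a document restricted to dictionary features could be prepared for SVMlight, the sketch below writes one training example in the SVMlight input format (a target followed by feature:value pairs with increasing feature numbers). Using raw term frequencies as feature values and sorting the dictionary alphabetically to assign feature numbers are assumptions; the paper does not state how the features were weighted.

```python
from collections import Counter


def build_index(dictionary):
    """Assign feature numbers 1..n to the opinion words (alphabetical order)."""
    return {word: i for i, word in enumerate(sorted(dictionary), start=1)}


def to_svmlight_line(tokens, label, feature_index):
    """Format one document as '<target> <feature>:<value> ...'.

    label is +1 (opinionated) or -1 (non-opinionated). Only tokens that
    are in the opinion dictionary become features.
    """
    counts = Counter(t for t in tokens if t in feature_index)
    pairs = sorted((feature_index[w], c) for w, c in counts.items())
    feats = " ".join(f"{idx}:{cnt}" for idx, cnt in pairs)
    return f"{label:+d} {feats}".rstrip()
```

Restricting the features to the opinion dictionary keeps each vector sparse and small compared with using all lexical unigrams.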

3.3  A  Hybrid  Method 

The Opinion Retrieval subtask requires ranking opinionated documents
after identifying their subjectivity. Since the subtask thus consists
of a classification step and a ranking step, we adopt a hybrid method
for each step. At the classification step, the documents classified as
non-opinionated by both LE and ML are removed. At the ranking step,
the LE and ML scores are combined with the topic relevance score to
produce the final ranking score:

\[
\mathrm{score}_{OR} = \lambda_1 \, \mathrm{score}_{IR} + \lambda_2 \, \mathrm{score}_{LE} + \lambda_3 \, \mathrm{score}_{ML} \tag{3}
\]

where $\mathrm{score}_{IR}$, $\mathrm{score}_{LE}$, and
$\mathrm{score}_{ML}$ are the topic relevance score of ad-hoc
retrieval, the opinion score estimated by Equation 2, and the output
score of the binary SVM classifier, respectively. In our experiments,
the weights of the component scores are heuristically tuned to
maximize the MAP of opinion ranking.
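The two steps of the hybrid method can be sketched as a filter followed by a weighted sum. The dictionary keys, the function name hybrid_rank, and the rule that a positive SVM output means "opinionated" are assumptions made for this example; the default weights are those reported for the kunlpKLttOc run in Table 4. Any score normalization the system may have applied is not shown.

```python
def hybrid_rank(docs, lam_ir=0.3, lam_le=0.1, lam_ml=0.6):
    """Filter and rank documents with the hybrid method (Equation 3).

    docs is a list of dicts with keys 'id', 'score_ir', 'score_le'
    and 'score_ml'. Returns (id, score) pairs, best first.
    """
    # Classification step: drop documents that both LE and ML reject.
    kept = [d for d in docs if d["score_le"] > 0 or d["score_ml"] > 0]
    # Ranking step: linear combination of the three component scores.
    scored = [
        (d["id"],
         lam_ir * d["score_ir"] + lam_le * d["score_le"] + lam_ml * d["score_ml"])
        for d in kept
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A document is removed only when both classifiers reject it, so either approach alone can keep a document in the ranking, which is how the combination is meant to increase recall.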

3.4  Spam  Filtering 

Since splogs have a negative effect on retrieval, we adopted a spam
filtering method similar to the one in (Mishne, 2006). We trained a
naive Bayes classifier to detect splogs. The spam training data was
collected by issuing the query casino to the Google web search
engine[6].

4  Polarized  Opinion  Retrieval 

Polarized Opinion Retrieval corresponds to the polarity classification
subtask of Blog Track 2007 (Macdonald et al., 2007). Ranking polar
opinionated documents (positive and negative ones) is a new issue this
year. As mentioned earlier, although mixed documents are not required
to be retrieved, identifying them is expected to make retrieval more
accurate. For this reason, we employed a machine learning approach for
this task, since the lexicon-based approach performed poorly in
preliminary experiments.

4.1  Classifying  Polar  Opinionated  Documents 

In our experiments, we used SVMmulticlass from the SVMlight package to
train a ternary classifier. The training corpus consisted of the
assessments of the previous two years for 100 topics. The
classification features were again the opinion words in the compiled
dictionary described in Section 3.1. At the classification step, the
trained classifier categorizes opinionated documents into positive,
negative, and mixed categories.

4.2  Ranking  Polar  Opinionated  Documents 

At the ranking step, mixed opinionated documents are ruled out; only
positive and negative opinionated ones are retained for ranking. The
final ranking score is the linear combination of the opinion ranking
score and the polar score of the ternary SVM classifier:

\[
\mathrm{score}_{POR} = \lambda_4 \, \mathrm{score}_{OR} + (1 - \lambda_4) \, \mathrm{score}_{SVM} \tag{4}
\]
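Equation 4 and the removal of mixed documents can be sketched as follows. The dictionary keys, the label strings, and the function name polarized_rank are hypothetical; the default value of 0.8 for the interpolation weight is the one reported in Table 6.

```python
def polarized_rank(docs, polarity, lam4=0.8):
    """Rank documents of one polarity with Equation 4.

    docs is a list of dicts with keys 'id', 'label' ('positive',
    'negative' or 'mixed'), 'score_or' (opinion ranking score) and
    'score_svm' (polar score of the ternary classifier).
    """
    # Mixed documents and documents of the other polarity are ruled out.
    kept = [d for d in docs if d["label"] == polarity]
    scored = [
        (d["id"], lam4 * d["score_or"] + (1 - lam4) * d["score_svm"])
        for d in kept
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the weight at 0.8, the opinion ranking score contributes four times as much as the polar score, so the polarized ranking stays close to the underlying opinion ranking.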


[5] http://svmlight.joachims.org/
[6] http://www.google.com


Run         Description         MAP      R-prec   P@10
kunlpKLtt   title only          0.2713   0.3544   0.5800
kunlpKLtd   title+description   0.2666   0.3465   0.6567

Table 2: Results of Ad-hoc Retrieval.


Expansion techniques       MAP      R-prec   P@10
None (title only)          0.2713   0.3544   0.5800
Cluster-based analysis     0.2089   0.2919   0.5240
Local-relevance feedback   0.2105   0.2841   0.4720

Table 3: Effects of query expansions for Ad-hoc Retrieval.


Run           Description                                   MAP      R-prec   P@10
kunlpKLtt     Baseline                                      0.1991   0.2799   0.3820
kunlpKLttOs   Hybrid model (λ1 = 0.3, λ2 = 0, λ3 = 0.7)     0.2234   0.3045   0.5553
kunlpKLttOc   Hybrid model (λ1 = 0.3, λ2 = 0.1, λ3 = 0.6)   0.2285   0.3138   0.5600
kunlpKLtd     Baseline                                      0.1953   0.2739   0.4553
kunlpKLtdOs   Hybrid model (λ1 = 0.3, λ2 = 0, λ3 = 0.7)     0.2186   0.3030   0.5500
kunlpKLtdOc   Hybrid model (λ1 = 0.3, λ2 = 0.1, λ3 = 0.6)   0.2191   0.3037   0.5620

Table 4: Results of Opinion Retrieval.


Spam filtering   Average MAP   Average R-prec   Average P@10
No               0.2193        0.3013           0.5400
Yes              0.2224        0.3069           0.5563

Table 5: Effects of spam filtering for Opinion Retrieval.


Run           Description                   MAP      R-prec   P@10
kunlpKLttPc   Positive ranking (λ4 = 0.8)   0.1454   0.2153   0.3329
              Negative ranking (λ4 = 0.8)   0.1229   0.1853   0.2754
kunlpKLtdPc   Positive ranking (λ4 = 0.8)   0.1361   0.2086   0.3262
              Negative ranking (λ4 = 0.8)   0.1234   0.1846   0.2718

Table 6: Results of Polarized Opinion Retrieval.


5  Results  and  Submissions 

We submitted two baseline runs for the Ad-hoc Retrieval subtask, both
based on the KL-divergence model: kunlpKLtt is the title-only run and
kunlpKLtd is the title-description run. Table 2 shows the evaluation
of these two baseline runs on the 150 topics of Blog Track 2006 to
2008. Interestingly, using the description improves the early
precision of ad-hoc retrieval (P@10 increases by 13.2%, from 0.58 to
0.6567), although MAP and R-prec drop slightly. Additionally, Table 3
shows the effects of the query expansion techniques on ad-hoc
retrieval. Neither cluster-based analysis nor pseudo-relevance
feedback increased performance.

From each baseline run, we generated two runs based on the hybrid
model for the Opinion Retrieval subtask (in the two runs, the
parameters λ are empirically adjusted to optimize the MAP of opinion
ranking). Table 4 shows the evaluation of the four submissions after
filtering splogs. The experimental results show that the proposed
hybrid method clearly improved opinion retrieval performance over the
baseline (the best run increased MAP by 14.8%, from 0.1991 to 0.2285).
In addition, spam filtering made a marginal contribution to opinion
retrieval (in our experiment it increased MAP by about 1% on average,
as shown in Table 5).
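The percentages quoted in this section are relative gains over the corresponding baseline value. As a quick check, they can be recomputed from the table entries; the helper name relative_gain is ours.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


# P@10 of ad-hoc retrieval, title only vs. title+description (Table 2)
p10_gain = relative_gain(0.58, 0.6567)
# MAP of opinion retrieval, baseline vs. best hybrid run (Table 4)
map_gain = relative_gain(0.1991, 0.2285)
# Average MAP without vs. with spam filtering (Table 5)
spam_gain = relative_gain(0.2193, 0.2224)

print(f"{p10_gain:.1f}% {map_gain:.1f}% {spam_gain:.1f}%")
```

Rounded to one decimal place, these evaluate to 13.2%, 14.8%, and 1.4%, consistent with the figures reported in the text.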

For Polarized Opinion Retrieval, we submitted two runs corresponding
to the two best runs of Opinion Retrieval (i.e., kunlpKLttOc and
kunlpKLtdOc). Table 6 shows the evaluation of these two runs with
respect to positive ranking and negative ranking. Table 6 suggests
that Polarized Opinion Retrieval performance is strongly dominated by
the underlying Opinion Retrieval results.

6  Conclusion 

This paper described our approaches to the Opinion Finding task of the
TREC Blog Track 2008. For Ad-hoc Retrieval, the KL-divergence model
achieved the best results among the models we tried. Using the topic
description was shown to help boost the early precision of Ad-hoc
Retrieval. For Opinion Retrieval, a hybrid model of the lexicon-based
approach and the machine learning approach was proposed to detect and
rank opinionated documents. Spam filtering slightly improved the
performance of Opinion Retrieval. For Polarized Opinion Retrieval, a
machine learning approach was employed to predict polar opinionated
documents. The polar score estimated by the classifier was then
linearly combined with the opinion score to produce the final ranking
score of those polar opinionated documents.